Computing and Evaluating Syntactic Complexity Features for Automated Scoring of Spontaneous Non-Native Speech
Authors
Abstract
This paper focuses on identifying, extracting and evaluating features related to the syntactic complexity of spontaneous spoken responses, as part of an effort to expand the current feature set of an automated speech scoring system to cover additional aspects considered important in the construct of communicative competence. Our goal is to find effective features, selected from a large set of previously proposed features and some new features designed analogously from a syntactic complexity perspective, that correlate well with human ratings of the same spoken responses, and to build automatic scoring models based on the most promising features using machine learning methods. On human transcriptions with manually annotated clause and sentence boundaries, our best scoring model achieves an overall Pearson correlation with human rater scores of r=0.49 on an unseen test set, whereas models using sentence or clause boundaries from automated classifiers reach correlations of only around r=0.2.
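To make the evaluation setup concrete, the sketch below fits a regression model to a handful of syntactic complexity features and reports its Pearson correlation with human rater scores on a held-out set. It is a minimal illustration under assumed names and toy data: the features (mean clause length, clauses per sentence, dependent-clause ratio), the linear regression learner, and all numbers are placeholders, not the features or models from the paper.

```python
# Minimal sketch (assumed features and toy data, not the paper's system):
# fit a scoring model on syntactic-complexity features and evaluate it by
# Pearson correlation with human rater scores on a held-out set.
import numpy as np
from sklearn.linear_model import LinearRegression
from scipy.stats import pearsonr

# Hypothetical per-response features: mean clause length, clauses per
# sentence, dependent-clause ratio (illustrative names only).
X_train = np.array([
    [7.2, 1.8, 0.40],
    [5.1, 1.2, 0.22],
    [9.4, 2.3, 0.55],
    [6.0, 1.5, 0.31],
    [8.3, 2.1, 0.48],
    [4.9, 1.0, 0.18],
])
y_train = np.array([3.0, 2.0, 4.0, 3.0, 4.0, 1.0])  # human proficiency ratings

X_test = np.array([
    [8.1, 2.0, 0.47],
    [4.8, 1.1, 0.20],
    [6.6, 1.6, 0.35],
    [9.0, 2.2, 0.52],
])
y_test = np.array([4.0, 2.0, 3.0, 4.0])

# Train the scoring model and score the unseen responses.
model = LinearRegression().fit(X_train, y_train)
predicted = model.predict(X_test)

# Agreement between machine scores and human ratings.
r, p_value = pearsonr(predicted, y_test)
print(f"Pearson r between machine and human scores: {r:.2f} (p={p_value:.3f})")
```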
Similar resources
Modeling Discourse Coherence for the Automated Scoring of Spontaneous Spoken Responses
This study describes an approach for modeling the discourse coherence of spontaneous spoken responses in the context of automated assessment of non-native speech. Although the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spontaneous spoken language, little prior research has been done to assess a speaker’s coherence in the context of a...
Shallow Analysis Based Assessment of Syntactic Complexity for Automated Speech Scoring
Designing measures that capture various aspects of language ability is a central task in the design of systems for automatic scoring of spontaneous speech. In this study, we address a key aspect of language proficiency assessment – syntactic complexity. We propose a novel measure of syntactic complexity for spontaneous speech that shows optimum empirical performance on real world data in multip...
Automatic scoring of non-native children's spoken language proficiency
In this study, we aim to automatically score the spoken responses from an international English assessment targeted to non-native English-speaking children aged 8 years and above. In contrast to most previous studies focusing on scoring of adult non-native English speech, we explored automated scoring of child language assessment. We developed automated scoring models based on a large set of fe...
Using an Ontology for Improved Automated Content Scoring of Spontaneous Non-Native Speech
This paper presents an exploration into automated content scoring of non-native spontaneous speech using ontology-based information to enhance a vector space approach. We use content vector analysis as a baseline and evaluate the correlations between human rater proficiency scores and two cosine-similarity-based features, previously used in the context of automated essay scoring. We use two ont...
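As a rough sketch of the content vector analysis baseline mentioned here, the snippet below computes a cosine-similarity feature between a response transcript and reference texts using TF-IDF vectors. The texts, vectorizer settings, and use of the maximum similarity as a feature are assumptions for illustration, not the system evaluated in the paper.

```python
# Minimal sketch (assumed setup): a cosine-similarity content feature between
# a spoken-response transcript and reference texts for a prompt.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical reference material for one prompt and one response transcript.
reference_texts = [
    "the lecture explains how syntactic complexity relates to proficiency",
    "speakers with higher proficiency produce longer and more varied clauses",
]
response = "the speaker says complexity of clauses shows higher proficiency"

# Build TF-IDF content vectors over the references, then project the response
# into the same vector space.
vectorizer = TfidfVectorizer(stop_words="english")
ref_matrix = vectorizer.fit_transform(reference_texts)
resp_vector = vectorizer.transform([response])

# One cosine similarity per reference text; the maximum can serve as a simple
# content feature for a scoring model.
similarities = cosine_similarity(resp_vector, ref_matrix)[0]
print("max cosine similarity:", similarities.max())
```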
SpeechRaterTM: a construct-driven approach to scoring spontaneous non-native speech
This paper presents an overview of the SpeechRater system of Educational Testing Service (ETS), a fully operational automated scoring system for non-native spontaneous speech employed in a practice context. This novel system stands in contrast to most prior speech scoring systems which focus on fairly predictable, low entropy speech such as read-aloud speech or short and predictable responses. ...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2011